Representation Optimization with Feature Selection and Manifold Learning in a Holistic Classification Framework
Abstract
Many complex, high-dimensional real-world classification problems require a carefully chosen set of features, algorithms, and hyperparameters to achieve the desired generalization performance. The choice of feature representation strongly affects prediction performance. Manifold learning techniques such as PCA, Isomap, Locally Linear Embedding (LLE), or autoencoders can learn a more suitable representation automatically. However, the performance of a manifold learner depends heavily on the dataset. This paper presents a novel automatic optimization framework that incorporates multiple manifold learning algorithms in a holistic classification pipeline together with feature selection and multiple classifiers with arbitrary hyperparameters. The highly combinatorial optimization problem is solved efficiently using evolutionary algorithms. Additionally, a multi-pipeline classifier based on the optimization trajectory is presented. The evaluation on several datasets shows that the proposed framework outperforms the Auto-WEKA framework in terms of generalization and optimization speed in many cases.
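To make the combinatorial search concrete, the following is a minimal sketch of a (mu + lambda) evolutionary search over pipeline configurations, each consisting of a feature mask, a manifold learner, its output dimensionality, and a classifier. It is not the paper's implementation: the search-space names, the mutation scheme, and `toy_fitness` are assumptions made for illustration. In a real setting `evaluate` would presumably return the cross-validated accuracy of the assembled pipeline.

```python
import random

# Hypothetical search space; the names are illustrative, not the paper's exact setup.
MANIFOLD_LEARNERS = ["none", "pca", "isomap", "lle", "autoencoder"]
CLASSIFIERS = ["knn", "svm", "random_forest"]
N_FEATURES = 8


def random_config(rng):
    """Sample one pipeline configuration uniformly at random."""
    return {
        "mask": [rng.random() < 0.5 for _ in range(N_FEATURES)],
        "manifold": rng.choice(MANIFOLD_LEARNERS),
        "n_components": rng.randint(1, N_FEATURES),
        "classifier": rng.choice(CLASSIFIERS),
    }


def mutate(config, rng, rate=0.2):
    """Return a copy of config in which each gene changes with probability rate."""
    child = dict(config)
    child["mask"] = [(not bit) if rng.random() < rate else bit
                     for bit in config["mask"]]
    if rng.random() < rate:
        child["manifold"] = rng.choice(MANIFOLD_LEARNERS)
    if rng.random() < rate:
        child["n_components"] = rng.randint(1, N_FEATURES)
    if rng.random() < rate:
        child["classifier"] = rng.choice(CLASSIFIERS)
    return child


def evolve(evaluate, generations=30, mu=5, lam=15, seed=0):
    """(mu + lambda) evolution: keep the mu best of parents plus offspring."""
    rng = random.Random(seed)
    scored = [(evaluate(c), c) for c in (random_config(rng) for _ in range(mu))]
    trajectory = []  # best score after each generation
    for _ in range(generations):
        offspring = [mutate(rng.choice(scored)[1], rng) for _ in range(lam)]
        scored += [(evaluate(c), c) for c in offspring]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        scored = scored[:mu]
        trajectory.append(scored[0][0])
    best_score, best_config = scored[0]
    return best_config, best_score, trajectory


def toy_fitness(config):
    """Stand-in for cross-validated accuracy; rewards a made-up target setup."""
    target_mask = [True, True, False, False, True, False, False, True]
    score = sum(a == b for a, b in zip(config["mask"], target_mask)) / N_FEATURES
    score += 0.5 if config["manifold"] == "isomap" else 0.0
    score += 0.5 if config["classifier"] == "svm" else 0.0
    return score


best, score, trajectory = evolve(toy_fitness)
print(best["manifold"], best["classifier"], round(score, 3))
```

Because parents survive alongside their offspring, the best score never decreases from one generation to the next. The recorded `trajectory` only tracks scores here; the multi-pipeline classifier mentioned in the abstract would need the evaluated configurations themselves, which this sketch does not retain.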
Related Resources
Automatic Representation and Classifier Optimization for Image-based Object Recognition
The development of image-based object recognition systems with the desired performance is still a challenging task, even for experts. The properties of the object feature representation have a great impact on the performance of any machine learning algorithm. Manifold learning algorithms such as PCA, Isomap, or autoencoders have the potential to automatically learn lower-dimensional and mor...
Image Classification via Sparse Representation and Subspace Alignment
Image representation is a crucial problem in image processing, where many low-level image representations exist, e.g., SIFT, HOG, and so on. But there is a missing link between low-level and high-level semantic representations. In fact, traditional machine learning approaches, e.g., non-negative matrix factorization, sparse representation, and principal component analysis, are employed to d...
Understanding the Interplay of Simultaneous Model Selection and Representation Optimization for Classification Tasks
The development of classification systems that meet the desired accuracy levels for real-world applications requires a lot of expertise. Numerous challenges, such as noisy feature data and suboptimal algorithms and hyperparameters, degrade the generalization performance. On the other hand, almost countless solutions have been developed, e.g. feature selection, feature preprocessing, automatic a...
A Classification Method for E-mail Spam Using a Hybrid Approach for Feature Selection Optimization
Spam is unwanted email that is harmful to communications around the world. Spam is a growing problem in personal email, so detecting it is essential. Machine learning is very useful for this problem, as its adaptive nature lets it learn the patterns required for classification with good results. Nonetheless, in spam detection, there are a large num...
Ensemble Classification and Extended Feature Selection for Credit Card Fraud Detection
Due to the rise of technology, the possibility of fraud in areas such as banking has increased. Credit card fraud is a crucial problem in banking, and its danger is ever increasing. This paper proposes an advanced data mining method that considers both feature selection and decision cost to improve the accuracy of credit card fraud detection. After selecting the best and most effec...